feat(github-body-field): rewrite one issue-body field out-of-context #412

Merged
potiuk merged 1 commit into apache:main from potiuk:feat-github-body-field-tool on May 31, 2026
potiuk (Member) commented May 31, 2026

Summary

  • New tools/github-body-field/ Python tool that reads, parses, and rewrites a single ### Field section of a GitHub issue body in a subprocess — the body never crosses into agent context.
  • Saves ~5K tokens per single-field PATCH (and removes reporter-supplied content from a path it didn't need to be on). Bulk-sync over 20 trackers ≈ 300–500K tokens saved per run.
  • Migrates one call site in this PR (security-cve-allocate Step 4 item 1 — set CVE tool link) to prove the pattern. Remaining call sites in apply-and-push.md / signals-to-actions.md will follow in separate PRs.

CLI:

body-field get  <N> --field "<name>"
body-field set  <N> --field "<name>" --value "<v>"
body-field list <N>

Idempotent: a set whose value already matches the current value is a no-op (no API write, no audit-log entry). The parser tracks fenced code blocks, so a literal ### foo inside a shell snippet is never mistaken for a heading.
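
As a rough illustration of the CLI surface listed above, the three subcommands could be wired up with argparse along these lines. This is only a sketch: treating `<N>` as an integer issue number, the `build_parser` name, and the `command` / `issue` attribute names are all assumptions and are not taken from the tool's source.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical argparse layout for the get / set / list subcommands."""
    parser = argparse.ArgumentParser(prog="body-field")
    sub = parser.add_subparsers(dest="command", required=True)

    # body-field get <N> --field "<name>"
    get_cmd = sub.add_parser("get", help="print the value of one field")
    get_cmd.add_argument("issue", type=int)  # assumption: <N> is an issue number
    get_cmd.add_argument("--field", required=True)

    # body-field set <N> --field "<name>" --value "<v>"
    set_cmd = sub.add_parser("set", help="rewrite one field in place")
    set_cmd.add_argument("issue", type=int)
    set_cmd.add_argument("--field", required=True)
    set_cmd.add_argument("--value", required=True)

    # body-field list <N>
    list_cmd = sub.add_parser("list", help="print all field headings")
    list_cmd.add_argument("issue", type=int)

    return parser
```

Parsing `["set", "412", "--field", "CVE tool link", "--value", "..."]` with this parser would yield a namespace whose `command` is `"set"` and whose `issue` is `412`.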

Test plan

  • 41 unit tests (parser edge cases + CLI orchestration) — green
  • ruff check / ruff format / mypy — green
  • prek run --files <staged> — green
  • skill-and-tool-validate — green
  • lychee on the new README — clean
  • CI lychee + tests-ok on this PR

🤖 Generated with Claude Code

… loading the body into agent context

The security-sync skills currently update a single `### Field`
section of a tracker issue body by reading the full 10-15 KB body
into agent context, regex-editing one field, and writing the whole
body back. That spends ~5 K tokens per single-field flip, in
addition to looping reporter-supplied content (the most sensitive
content on the tracker) through the agent for no reason.

`tools/github-body-field/` is a stdlib-only Python tool that does
the read / parse / replace / push in a subprocess. The agent only
sees the diff summary on stdout. Subcommands:

  body-field get  <N> --field "<name>"     → prints one value
  body-field set  <N> --field "<name>" \
                      --value "<v>"        → rewrites in place
  body-field list <N>                       → prints all headings
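
To make the "read and push in a subprocess" idea concrete, here is a minimal sketch of how the body could be fetched and written back through the `gh` CLI, so that only a short summary ever reaches stdout. The function names, the injectable `run` parameter, and the summary format are hypothetical; the real tool may invoke `gh` differently.

```python
import subprocess


def read_body(issue: int, run=subprocess.run) -> str:
    """Fetch the issue body via gh; the text stays inside this process."""
    proc = run(
        ["gh", "issue", "view", str(issue), "--json", "body", "--jq", ".body"],
        capture_output=True,
        text=True,
        check=True,
    )
    return proc.stdout


def push_body(issue: int, body: str, run=subprocess.run) -> None:
    """Write the full body back, passing it on stdin rather than on argv."""
    run(
        ["gh", "issue", "edit", str(issue), "--body-file", "-"],
        input=body,
        capture_output=True,
        text=True,
        check=True,
    )


def summarize(field: str, old: str, new: str) -> str:
    """Hypothetical one-line summary: sizes only, never the field content."""
    return f"{field}: {len(old)} chars -> {len(new)} chars"
```

Accepting `run` as a parameter is a design choice made for this sketch so that the calls can be exercised in tests without network access.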

Idempotent: a `set` whose value already matches skips the push
(saves an API call + an audit-log entry). Parser is a small state
machine tracking fenced code blocks so a literal `### foo` inside
a shell snippet never false-matches as a heading.
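
The fence-tracking parser and the no-op short-circuit described above might look roughly like the following. It is a sketch under stated assumptions, not the tool's actual code: the function names, the simplified fence rules, picking the first match when headings are duplicated, and the blank-line padding around the value are all illustrative choices.

```python
import re
from typing import Optional

FENCE = re.compile(r"^ {0,3}(`{3,}|~{3,})(.*)$")
HEADING = re.compile(r"^### +(.*?)\s*$")


def find_sections(body: str) -> list[tuple[str, int, int]]:
    """Return (name, start, end); the value is lines[start:end] of the body."""
    lines = body.split("\n")
    sections: list[tuple[str, int, int]] = []
    fence: Optional[str] = None  # marker of the currently open fence, if any
    current: Optional[tuple[str, int]] = None
    for i, line in enumerate(lines):
        m = FENCE.match(line)
        if m:
            marker, rest = m.group(1), m.group(2)
            if fence is None:
                fence = marker  # opening fence
            elif marker[0] == fence[0] and len(marker) >= len(fence) and not rest.strip():
                fence = None  # closing fence
            continue
        if fence is not None:
            continue  # inside a code block: "### foo" is not a heading here
        h = HEADING.match(line)
        if h:
            if current is not None:
                sections.append((current[0], current[1], i))
            current = (h.group(1), i + 1)
    if current is not None:
        sections.append((current[0], current[1], len(lines)))
    return sections


def get_field(body: str, name: str) -> Optional[str]:
    lines = body.split("\n")
    for sec_name, start, end in find_sections(body):
        if sec_name == name:  # assumption: first match wins on duplicates
            return "\n".join(lines[start:end]).strip()
    return None


def set_field(body: str, name: str, value: str) -> Optional[str]:
    """Return the rewritten body, or None when the value already matches."""
    lines = body.split("\n")
    for sec_name, start, end in find_sections(body):
        if sec_name != name:
            continue
        if "\n".join(lines[start:end]).strip() == value.strip():
            return None  # no-op: the caller skips the push entirely
        replacement = ["", *value.strip().split("\n"), ""]
        return "\n".join(lines[:start] + replacement + lines[end:])
    raise KeyError(name)
```

Returning `None` for an unchanged value lets the caller skip both the API call and the audit-log entry without comparing bodies itself.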

Migrates one call site in this PR — security-cve-allocate Step 4
item 1 (set *CVE tool link*) — to prove the pattern. The remaining
call sites (Public advisory URL, Remediation developer, Short
public summary for publish, Affected versions, etc.) will follow
in separate PRs.

41 unit tests cover the parser edge cases (fenced headings,
duplicate headings, empty values, last-section padding, idempotent
rewrites) and the CLI orchestration (dry-run, no-op short-circuit,
exit-code contract).
@potiuk potiuk merged commit 9c07ea6 into apache:main May 31, 2026
16 checks passed
potiuk added a commit that referenced this pull request May 31, 2026
…PATCH out of context (#424)

Every status-update-emitting skill (sync, import, allocate,
dedupe, fix) folds its update into one rollup comment per
tracker via the recipe in `tools/github/status-rollup.md`. The
recipe currently walks the agent through: fetch the comment
body, concatenate `<old body>` + ruler + new entry, PATCH the
comment. That loops the full rollup body — which grows
monotonically on long-running trackers — through agent context
on every sync pass.

`tools/github-rollup/` is the same shape as PR #412's
`github-body-field`: a stdlib + `gh` subprocess wrapper that
keeps the body out of agent context. CLI:

  github-rollup append <N> --action "<label>" --entry-body "..."
  github-rollup list   <N>
  github-rollup latest <N>

`append` auto-detects whether the issue already has a rollup
comment (via the marker prefix `<!-- airflow-s status rollup
v`); creates one if not, PATCHes the existing one if yes.
`--now ISO8601`, `--user @handle`, and `--dry-run` flags
support deterministic replay tests and pre-flight checks.
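
The create-versus-PATCH decision made by `append` can be sketched as a pure function over the issue's comments. Everything beyond the marker prefix and the summary shape quoted in this message is an assumption: the `v1 -->` marker suffix, the `---` ruler, the comment dict layout, and the function names are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional

MARKER_PREFIX = "<!-- airflow-s status rollup v"
RULER = "\n\n---\n\n"  # assumption: the recipe's ruler is a Markdown rule


def render_entry(now: datetime, user: str, action: str, entry_body: str) -> str:
    """Render one entry in the canonical date, user, action summary shape."""
    summary = f"{now:%Y-%m-%d} · {user} · {action}"
    return f"<details><summary>{summary}</summary>\n\n{entry_body}\n\n</details>"


def plan_append(comments: list[dict], entry: str) -> tuple[str, Optional[int], str]:
    """Return (operation, comment_id, new_body) without touching the network."""
    for comment in comments:
        if comment["body"].lstrip().startswith(MARKER_PREFIX):
            new_body = comment["body"].rstrip() + RULER + entry
            return ("patch", comment["id"], new_body)
    # No rollup comment yet: create one (the "1 -->" suffix is an assumption).
    return ("create", None, f"{MARKER_PREFIX}1 -->\n\n{entry}")
```

Passing the timestamp in explicitly mirrors the `--now` flag: the same inputs always produce the same body, which is what makes replay tests deterministic.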

Parser is a small state machine that:

- recognises the rollup marker prefix (forward-compatible with
  future v2+ bumps),
- iterates `<details><summary>YYYY-MM-DD · @user ·
  Action</summary>...</details>` entries in document order,
- preserves entries whose summary doesn't follow the canonical
  shape (returned with blank fields so round-trip stays safe),
- tolerates missing close tags + trailing whitespace.
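
The parsing rules listed above could be approximated as follows. This is a simplified sketch, not the tool's parser: the `Entry` dataclass, returning an empty list when the marker is missing, and the lack of support for nested `<details>` blocks are assumptions made to keep the example short.

```python
import re
from dataclasses import dataclass

MARKER_PREFIX = "<!-- airflow-s status rollup v"
SUMMARY = re.compile(r"^(\d{4}-\d{2}-\d{2}) · (@\S+) · (.+)$")
OPEN_TAG, CLOSE_TAG = "<details>", "</details>"


@dataclass
class Entry:
    date: str
    user: str
    action: str
    raw: str  # original text, kept so a round-trip loses nothing


def parse_rollup(body: str) -> list[Entry]:
    """Iterate <details> entries in document order."""
    if not body.lstrip().startswith(MARKER_PREFIX):
        return []  # assumption: a body without the marker has no entries
    entries: list[Entry] = []
    pos = 0
    while True:
        start = body.find(OPEN_TAG, pos)
        if start == -1:
            break
        close = body.find(CLOSE_TAG, start)
        nxt = body.find(OPEN_TAG, start + len(OPEN_TAG))
        if close != -1 and (nxt == -1 or close < nxt):
            end = close + len(CLOSE_TAG)
        else:
            # Missing close tag: the entry runs to the next one or to the end.
            end = nxt if nxt != -1 else len(body)
        raw = body[start:end].rstrip()
        found = re.search(r"<summary>(.*?)</summary>", raw, re.DOTALL)
        text = " ".join(found.group(1).split()) if found else ""
        m = SUMMARY.match(text)
        # Non-canonical summaries are preserved with blank fields.
        date, user, action = m.groups() if m else ("", "", "")
        entries.append(Entry(date, user, action, raw))
        pos = end
    return entries
```

Under this sketch, `latest` would amount to taking the last element of the returned list.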

35 unit tests cover the parser (every edge case from
`status-rollup.md`'s spec — multi-entry, missing marker, non-
canonical summary, missing close tag) and the CLI orchestration
(append-existing, create-new, dry-run create vs append, user
override, all the exit-code contract paths, list, latest).

The migration of the 5+ call sites that currently reference
`tools/github/status-rollup.md` Step 2a will follow in
subsequent PRs once the tool's behaviour is verified in
practice. Adds workspace member + docs/labels-and-capabilities
row in lock-step (caught by `check-workspace-members` prek
hook).
potiuk added a commit to potiuk/magpie that referenced this pull request Jun 1, 2026
…ity-suite refactor patterns

Adds `optimize-skill` (capability:setup) — the refactoring sibling of
`write-skill`. It takes an existing framework skill (or sweeps a set)
and applies the five restructuring patterns proven on the security
suite, as behavior-preserving proposals gated by the validator
(green-before / green-after):

- split — slim an oversized SKILL.md into linked siblings (the apache#410
  pattern; addresses the PRINCIPLES.md P14 cap)
- config-lift — move concrete values into <project-config> (apache#386/apache#387/apache#388)
- out-of-context — read/PATCH one field without loading the body
  (apache#412 github-body-field, apache#424 github-rollup)
- fetch-upfront — batch per-item round-trips (apache#347)
- preflight-classifier — skip obvious no-ops before LLM passes (apache#414/apache#416)

SKILL.md is 297 lines; the pass catalogue (smell / exemplar PR /
mechanics / behavior-preservation guarantee / validation) lives in
the patterns.md sibling. Reads only framework-internal files, so no
injection-guard / Privacy-LLM callouts.

Ships a step-diagnose eval (5 auto-comparable cases incl. an
injection-resistance case) so the skill is not released without an
eval (P8). Wires the skill into the capability->skill map and the
eval index.

Generated-by: Claude Code (Opus 4.8)
potiuk added a commit that referenced this pull request Jun 1, 2026
…ity-suite refactor patterns (#427)

Adds `optimize-skill` (capability:setup) — the refactoring sibling of
`write-skill`. It takes an existing framework skill (or sweeps a set)
and applies the five restructuring patterns proven on the security
suite, as behavior-preserving proposals gated by the validator
(green-before / green-after):

- split — slim an oversized SKILL.md into linked siblings (the #410
  pattern; addresses the PRINCIPLES.md P14 cap)
- config-lift — move concrete values into <project-config> (#386/#387/#388)
- out-of-context — read/PATCH one field without loading the body
  (#412 github-body-field, #424 github-rollup)
- fetch-upfront — batch per-item round-trips (#347)
- preflight-classifier — skip obvious no-ops before LLM passes (#414/#416)

SKILL.md is 297 lines; the pass catalogue (smell / exemplar PR /
mechanics / behavior-preservation guarantee / validation) lives in
the patterns.md sibling. Reads only framework-internal files, so no
injection-guard / Privacy-LLM callouts.

Ships a step-diagnose eval (5 auto-comparable cases incl. an
injection-resistance case) so the skill is not released without an
eval (P8). Wires the skill into the capability->skill map and the
eval index.

Generated-by: Claude Code (Opus 4.8)